In [62]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.style.use('ggplot')
This worksheet will walk you through the basic process of preparing a visualization using Python/Pandas/Matplotlib.
For this exercise, we will create a line plot comparing the number of hosts infected by the Bedep and ConfickerAB bot families in the Government/Politics sector.
The data we will be using is in the dailybots.csv file, which can be found in the data folder. As is common, we will have to do some data wrangling to get it into a format that we can visualize. To do that, we'll need to:
| | date | ConfickerAB | Bedep |
|---|---|---|---|
| 0 | 2016-06-01 | 255 | 430 |
| 1 | 2016-06-02 | 431 | 453 |
The way I chose to do this in the answer notebook might be a little more complex than necessary, but I wanted you to see all the steps involved.
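As a rough guide, here is one possible way to get from long-format rows to the table above. It is only a sketch and not necessarily the approach taken in the answer notebook. It assumes the file has columns named `date`, `botfam`, `industry` and `hosts`; check the real header with `raw.head()` before relying on those names. A tiny stand-in DataFrame is used here in place of `pd.read_csv`, reusing the four values from the table plus two made-up rows that the filter should drop.

```python
import pandas as pd

# Stand-in for pd.read_csv("data/dailybots.csv").
# ASSUMPTION: the column names below are guesses at the real file's header.
raw = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-01", "2016-06-01",
             "2016-06-02", "2016-06-02", "2016-06-02"],
    "botfam": ["ConfickerAB", "Bedep", "Bedep",
               "ConfickerAB", "Bedep", "OtherBot"],
    "industry": ["Government/Politics", "Government/Politics", "Education",
                 "Government/Politics", "Government/Politics",
                 "Government/Politics"],
    # 88 and 12 are made-up filler values for rows that get filtered out.
    "hosts": [255, 430, 88, 431, 453, 12],
})

# Parse the dates so they sort and plot as real timestamps.
raw["date"] = pd.to_datetime(raw["date"])

# Keep only the sector and the two bot families of interest.
families = ["ConfickerAB", "Bedep"]
mask = (raw["industry"] == "Government/Politics") & raw["botfam"].isin(families)
subset = raw.loc[mask, ["date", "botfam", "hosts"]]

# Reshape from long to wide: one row per date, one column per bot family.
wide = subset.pivot(index="date", columns="botfam", values="hosts")

# Fix the column order, move the date back into a column, drop the axis label.
wide = wide[families].reset_index()
wide.columns.name = None

print(wide)
```

Note that `pivot` raises an error if a date and bot family pair occurs more than once after filtering; in that case `pivot_table` with an explicit aggregation would be the safer choice.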
In [ ]:
In [ ]:
The default plot doesn't look horrible, but there are certainly some improvements that can be made. Try the following:
There are many examples in the pandas visualization documentation: http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html
A few hints:
http://stackoverflow.com/questions/4700614/how-to-put-the-legend-out-of-the-plot
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.html
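To give a feel for the kind of tweaks those links describe, here is a minimal sketch that sets a title and axis labels and moves the legend outside the plot area. The specific choices (title text, figure size, legend position) are illustrative assumptions rather than the required answer, and the two-row DataFrame simply reuses the values from the table earlier in this worksheet.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small wide-format frame using the values shown in the table above.
wide = pd.DataFrame({
    "date": pd.to_datetime(["2016-06-01", "2016-06-02"]),
    "ConfickerAB": [255, 431],
    "Bedep": [430, 453],
})

# DataFrame.plot returns a matplotlib Axes that can be customised further.
ax = wide.plot(x="date", y=["ConfickerAB", "Bedep"],
               figsize=(10, 5), linewidth=2)

# Illustrative title and labels (wording is an assumption).
ax.set_title("Infected hosts in Government/Politics")
ax.set_xlabel("Date")
ax.set_ylabel("Hosts")

# Anchor the legend's left edge just past the right edge of the axes,
# which places it outside the plotting area.
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5), title="Bot family")
```

With `%matplotlib inline` active, the figure is rendered when the cell finishes; outside a notebook you would call `plt.show()` yourself.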
In [ ]:
In [58]:
from bokeh.plotting import output_notebook
output_notebook()
In [59]:
from bokeh.charts import ... #Your code here..
from bokeh.io import show
In [ ]: